Assignment 1: Colorizing the Prokudin-Gorskii Photo Collection

Course: Learning-Based Image Synthesis

Zoë LaLena

Assignment Overview

Background

Sergey Mikhaylovich Prokudin-Gorsky had the foresight to realize that color photography would be the way of the future. He had an estimated 3,500 negatives in his possession when he left Russia, but only 1,902 were purchased by the Library of Congress in 1948; the others were lost or confiscated by the Russian government. Our goal in this project is to use the digital scans of these negatives to create color images. For each subject used in this project we have three exposures, taken through a red, a green, and a blue filter. By aligning and combining these three images we get a color image.

Method

We first implement a simple exhaustive search on a smaller image from the collection. We align both the green and red channel images to the blue channel image, which serves as the reference. We choose a range over which to shift the red and green images, in our case [-15, 15] pixels in both the x and y directions. For every combination of x and y shifts in this range, we shift the image and compute the normalized cross-correlation (NCC) between the shifted image and the blue reference. The shift with the highest NCC value is our final transform for a given image. Once the green and red images are aligned with the blue, the three can be merged into a color image.
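The single-scale search can be sketched as follows. This is a minimal illustration rather than the exact code used for the results: the function names `ncc` and `align_single_scale` are hypothetical, channels are assumed to be 2-D NumPy arrays of equal shape, and shifting is done with a circular `np.roll`, which is an assumption since the border handling is not specified above.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two same-shaped arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_single_scale(channel, reference, max_shift=15):
    """Return the (row, col) shift of `channel` that maximizes NCC with `reference`."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Assumption: circular shift; pixels pushed off one edge wrap around.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the green and red shifts in hand, the aligned channels could be merged with something like `np.dstack([red_aligned, green_aligned, blue])`. Because the wrapped-around pixels from `np.roll` pollute the borders, scoring only an interior crop of each image is a common refinement.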

For the larger TIFF images our first implementation is too slow, so we need to search for the shift more efficiently. We use an image pyramid to do this. Our image pyramid is a stack with the original image at the bottom, where each consecutive image is half the size of the one below it. We start our search on the smallest image and use that estimate to narrow the search on the next largest image. We repeat this until we reach the original resolution. To speed up the process further, we also gradually decrease the range we search over.
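A coarse-to-fine search along these lines might look like the sketch below. It is illustrative only: the names `downsample`, `search`, and `align_pyramid`, the 2x2 block averaging used for downsampling, the circular `np.roll` shift, and the default number of levels and search radii are all assumptions, not the settings used for the results in this report.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two same-shaped arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Test every shift within `radius` of `center`; return the best (row, col)."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (assumed resize method)."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, levels=4, radius=15, min_radius=1):
    """Coarse-to-fine alignment; returns the (row, col) shift at full resolution."""
    pyramid = [(channel, reference)]
    for _ in range(levels - 1):
        c, r = pyramid[-1]
        pyramid.append((downsample(c), downsample(r)))

    shift = (0, 0)
    for level, (c, r) in enumerate(reversed(pyramid)):  # smallest image first
        if level > 0:
            # One pixel at the coarser level is two pixels at this level.
            shift = (2 * shift[0], 2 * shift[1])
            # Shrink the search range as the resolution grows.
            radius = max(radius // 2, min_radius)
        shift = search(c, r, shift, radius)
    return shift
```

The design choice here is that the widest search happens on the smallest image, where each NCC evaluation is cheap, and the full-resolution image is only scanned over a few pixels around the doubled estimate.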

All shift values are given in (row, column), i.e. (y, x), format. To avoid unnecessary repetition, the unaligned color image is shown for only some of the examples.

Single Scale

Cathedral.jpg

Without Alignment
With Alignment

Green Shift Values: (5, 2)

Red Shift Values: (12, 3)

Pyramid Implementation

Icon

Without Alignment
With Alignment

Green Shift Values: (137, 21)

Red Shift Values: (64, 11)

Harvesters

With Alignment

Green Shift Values: (121, 13)

Red Shift Values: (57, 16)

Lady

With Alignment

Green Shift Values: (100, 12)

Red Shift Values: (48, 9)

Self Portrait

With Alignment

Green Shift Values: (175, 36)

Red Shift Values: (77, 28)

Three Generations

With Alignment

Green Shift Values: (109, 11)

Red Shift Values: (50, 13)

Train

With Alignment

Green Shift Values: (85, 31)

Red Shift Values: (40, 5)

Turkmen

With Alignment

Green Shift Values: (108, 26)

Red Shift Values: (54, 20)

Village

With Alignment

Green Shift Values: (137, 21)

Red Shift Values: (64, 11)

3 New Images

Monastery

Without Alignment
With Alignment

Green Shift Values: (138, 21)

Red Shift Values: (63, 13)

Lamp

With Alignment

Green Shift Values: (52, 24)

Red Shift Values: (5, 13)

Peasant Girls

With Alignment

Green Shift Values: (10, 17)

Red Shift Values: (-17, 10)

Failure Case

Bells and Whistles 1: Canny Edge Detection

Emir

Without Alignment
With Alignment

Green Shift Values: (68, 45)

Red Shift Values: (48, 24)

Why does it fail?

This image in particular does not work with our NCC method. Emir's channels differ too much in intensity, which leads to a poor alignment. We can fix this by discarding most of the intensity information: running a Canny edge detector on each channel leaves only white edges on a black background. These edge images are easier to align than the original channels.
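The idea of aligning on edges instead of raw intensities can be sketched as below. To keep the example dependency-free, `edge_map` is a simplified stand-in for Canny: it only thresholds the gradient magnitude and skips Canny's smoothing, non-maximum suppression, and hysteresis steps. In practice a library Canny implementation would take its place. The function names, the threshold, and the circular shift are assumptions made for illustration.

```python
import numpy as np

def edge_map(img, threshold=0.2):
    """Binary edge image from thresholded gradient magnitude.

    Simplified stand-in for Canny, NOT a full Canny detector.
    """
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    if peak == 0:
        return np.zeros_like(mag)
    return (mag > threshold * peak).astype(float)

def ncc(a, b):
    """Normalized cross-correlation between two same-shaped arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_on_edges(channel, reference, max_shift=15):
    """Return the (row, col) shift that best aligns the two edge maps."""
    channel_edges, reference_edges = edge_map(channel), edge_map(reference)
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Assumption: circular shift via np.roll.
            shifted = np.roll(channel_edges, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference_edges)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Because the edge map depends only on where intensity changes, not on how bright each region is, two channels that render the same object with very different (even inverted) brightness still produce matching edge images.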

With Canny Edge Detection

Green Shift Values: (107, 40)

Red Shift Values: (49, 24)

Auto Cropping

Bells and Whistles 2

As seen in the results of the previous sections, the way the images were taken leaves strips of very high or very low intensity around the borders. A simple averaging-based algorithm removes most of these strips, although it is not perfect.

For each row in the image we take the average intensity of each channel. If any channel has a very low (<20) or very high (>235) average, we leave that row out of the cropped image. We then do the same for the columns.
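A minimal sketch of this cropping rule is shown below. The function name `auto_crop` is hypothetical, the image is assumed to be an (H, W, 3) array on a 0 to 255 scale, and the column test is assumed to run on the already row-cropped image, which the description above leaves open.

```python
import numpy as np

def auto_crop(img, low=20, high=235):
    """Drop rows, then columns, whose per-channel mean falls outside [low, high]."""
    # Mean of each channel along every row: shape (H, 3).
    row_means = img.mean(axis=1)
    keep_rows = np.all((row_means >= low) & (row_means <= high), axis=1)
    img = img[keep_rows]

    # Assumption: column means are computed on the row-cropped image: shape (W, 3).
    col_means = img.mean(axis=0)
    keep_cols = np.all((col_means >= low) & (col_means <= high), axis=1)
    return img[:, keep_cols]
```

Note that this removes any row or column that fails the test, not only those at the border, so a row or column running through a very uniform dark or bright region can also be dropped.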

Peasant Girls

Without Cropping
With Cropping

Turkmen

With Cropping

Lady

With Cropping

Train

This works in most cases, but in images like Train, where the background is very uniform, it can remove too many rows or columns of pixels.

Without Cropping
With Cropping

New Data: Palimpsests

Bells and Whistles 3

When looking at the NASA data I remembered that I have a large amount of image data taken through different filters. I have done a lot of research on imaging medieval palimpsest documents. This process included illuminating documents with UV light and imaging them through various filters. The document may have moved slightly while filters were being switched out, but the bigger problem was that the filters differed in thickness and color, so the light hit the camera sensor slightly differently for each filter. All images therefore needed to be aligned before any further post-processing. I did this with OpenCV several years ago, but let's see how this method works instead. The record of which wavelengths these filters passed is lost on a drive somewhere, so I picked three random images taken with different filters.

Icon

Filter 3C1
Filter 3C10
Filter XC4
Without Alignment
With Alignment

Green Shift Values: (5, -1)

Red Shift Values: (4, -2)

The alignment shift is very small, but crucial. We are looking for evidence of removed text on documents like this. If the pseudo-color version is even slightly misaligned, it will be very hard for a scholar to read the already faint text (perpendicular to the bold text) that we are interested in.

Note: Since we are looking for evidence of removed text, our pseudo-color version does not need to look realistic; it needs to emphasize the removed text, circled below.